As nearly every U.S. college applicant can attest, the majority
of domestic colleges and universities require standardized tests for
admission to undergraduate, graduate, and professional programs.
Although controversial (Baron & Norman, 1992; FairTest, 2006),
tests such as the Scholastic Aptitude Test (SAT), the Graduate
Record Examination (GRE), and others are valued by higher
educational institutions as predictors of first-year student grade
point average (Bridgeman, McCamley-Jenkins, & Ervin, 2000)
and graduate school success (Burton & Wang, 2005) and as an
efficient measure of underlying traits such as math or reading
ability. Given the widespread use and high-stakes nature of these
assessments, understanding factors that affect test-taking ability in
young adults is vital.


Factors that are largely determined by birth, such as gender and
race, are important to any conversation about fair and equitable
testing (for examples of these types of studies, see Arbuthnot,
2005; Holland, Hoffman, & Thompson, 2002; Ramist, Lewis, &
McCamley-Jenkins, 1994; Schmitt & Dorans, 1990). Although
scholarly attention has often focused on these birth factors,
sufficiently prevalent acquired characteristics may also help explain
widespread individual differences on standardized tests. In
particular, this article focuses on the role that symptoms of posttraumatic
stress disorder (PTSD) potentially play in academic assessments.
PTSD is associated with symptoms such as intrusive thoughts,
poor concentration, and hypervigilance to threat in the environment
that could be predicted to interfere with test taking. Moreover,
a growing literature suggests that PTSD is associated with
attention, working memory, and other cognitive deficits (Brewin,
Kleiner, Vasterling, & Field, 2007; Hart et al., 2008; Vasterling &
Brailey, 2005) that could likewise adversely affect performance on
standardized academic tests.

A continuum of posttraumatic stress symptoms (PSS), including
those sufficiently severe as to reach criteria for PTSD, may result
from exposure to any extreme traumatic stressor such as military
combat, physical and sexual assault, child abuse, disasters, or
accidents (American Psychiatric Association [APA], 2000). In
recent U.S. history, Hurricane Katrina, Operation Iraqi Freedom
(OIF)/Operation Enduring Freedom, and the terrorist attacks of
September 11, 2001, typify events that might trigger PSS or PTSD.
Current diagnostic classifications group PTSD symptoms into
three clusters: (a) reexperiencing of the traumatic event (e.g.,
nightmares, intrusive thoughts), (b) avoidance of stimuli associated
with the traumatic event and numbing of general responsiveness
(e.g., restricted range of affect, loss of interest in previously
engaging activities), and (c) increased arousal symptoms (e.g.,
poor concentration, sleep disturbance).

Many Americans are exposed to inner-city violence, family
violence, rape, and other extreme stress. The National Comorbidity
Study-Replicate (Kessler et al., 2005) estimated the lifetime
prevalence of PTSD in a nationally representative community-based
sample to be 6.8%. The prevalence of PSS and PTSD may be even
higher in at-risk populations such as war-zone veterans. For
example, according to a major study of Vietnam-era veterans (Kulka
et al., 1990), nearly a third of men (30.9%) and over a quarter of
women (26.9%) who served in Vietnam experienced PTSD at
some point in their lives, with an additional 22.5% of men and
21.5% of women experiencing a subset of PTSD symptoms that
were notable but not sufficient to meet full diagnostic criteria.
Reanalysis of a male-only subset of the same Vietnam sample but
with more stringent diagnostic criteria found an adjusted lifetime
PTSD figure of 18.9% (Dohrenwend et al., 2006). Nonpopulation-based
samples of OIF veterans have revealed screening-based
estimates of PTSD that range from 11.6% to 12.9% among
recently returned military personnel (Hoge et al., 2004; Vasterling et
al., 2006), with rates increasing over time (Milliken, Auchterlonie,
& Hoge, 2007). A combined sample of U.S. service members
deployed to Iraq or Afghanistan demonstrated screening-based
PTSD rates of 13.8% (Schell & Marshall, 2008), with new onset
rates of 7.6% among combat-exposed study participants (Smith et
al., 2008). The prevalence of PTSD among the groups outlined
above suggests that if deleterious effects of PTSD on test-taking
ability are found, a large group of people could be at a significant
disadvantage in testing situations used for promotion or college
admission.

Several studies have examined cross-sectional relationships
between chronic PTSD and performance on the types of constructs
measured in standardized assessments for college admissions.
Findings indicate that IQ scores are inversely related to PTSD symptom
severity (Brandes et al., 2002; Gil, Calev, Greenberg, Kugelmass,
& Lerer, 1990; Gilbertson, Gurvits, Lasko, Orr, & Pitman, 2001;
Gurvits et al., 2000, 1993; Vasterling, Brailey, Constans, Borges,
& Sutker, 1997; Vasterling et al., 2002). In particular, Brandes et
al. (2002) and Vasterling et al. (2002) found Pearson correlations
of approximately -.30 between measures of PTSD symptoms and
measures of intelligence. In a study relying in part on archival
military data, Macklin et al. (1998) likewise found that current
intellectual performance was inversely related to PTSD symptom
severity with a partial correlation of -.37. However, cross-sectional
associations between postcombat measures of current
intelligence and PTSD symptom severity were no longer significant
after controlling for precombat intelligence estimated from
archival records, suggesting that lower precombat intelligence may have
created additional risk of PTSD, rather than PTSD affecting
intellectual performance.

The literature examining the relationships between exposure to
violence (a common predictor of PTSD) and the academic
achievement of adolescents and adults also suggests an association
between traumatic experiences and standardized test performance.
For example, Schwab-Stone et al. (1995) documented a significant
negative relationship between direct exposure to violence and
school achievement in a sample of over 2,000 adolescents in an
urban community. A similar study (Schwartz & Gorman, 2003)
found a negative relationship between exposure to community
violence and academic functioning as measured by a standardized
test of achievement and grade point average. Examining
associations specifically between PTSD and achievement in Lebanese
adolescents exposed to frequent occurrences of violence such as
terrorist attacks and artillery fire, Saigh, Mroueh, and Bremner
(1997) found that those adolescents with PTSD, compared with
those without PTSD, had lower levels of scholastic achievement
on the Metropolitan Achievement Test, a standardized index of
academic achievement in the areas of reading, mathematics, and
language.

Missing from the literature, however, is prospective research
allowing greater inferences regarding the potential causal pathway
between PSS and test taking. The current study uses prospectively
gathered data from the Neurocognition Deployment Health Study
(NDHS; Vasterling et al., 2006) to examine potential changes in
test-taking ability as a function of PSS. The study from which the
data are drawn included neurocognitive and emotional assessment
of a cohort of 1,595 U.S. Army soldiers, many of whom eventually
deployed to Iraq in support of OIF. Some of the neurocognitive
tasks administered in this study evaluated processes similar to
those measured in a standardized testing environment. Relevant
measures include tasks assessing logical reasoning and vocabulary,
cognitive skills measured on standardized tests such as the
SAT, ACT, and GRE. The availability of both pre- and post-war-zone
neurocognitive and PTSD symptom data makes the data set
uniquely suited to examine the effects of PSS on standardized test
performance.

Because the current study targets how the examinee's ability to
correctly answer a standardized test question is affected by the
acquired characteristic of PSS, item response theory (IRT) models
were fit to the data. In IRT, responses to items are viewed as
observable indicators of an individual's latent ability in which all
examinees (and items) can be placed on a common scale to assess
how much of the latent trait an examinee has and how much of the
latent trait an examinee needs to correctly answer items with high
probability. The current analysis uses IRT with covariates (Adams,
Wilson, & Wu, 1997; de Boeck & Wilson, 2004; Zwinderman,
1991, 1997) to examine the relationship between PSS and
test-taking ability as evidenced by responses to two tasks that tap skills
similar to those measured on standardized tests. To control for the
possibility that study participants suffered from PSS prior to OIF
war-zone exposure, we included an effect for predeployment PSS
in the model.

The addition of covariates in the traditional IRT model is
particularly useful for flexible modeling of categorical survey and
assessment data and for explaining individual differences.
Although IRT is typically limited to descriptive uses (as outlined
above), IRT with covariates is also useful for explanatory
purposes. Given this latter use, the study can investigate a possible
relationship between responses to items on a test and other
variables related to the item or the examinee (for a detailed discussion
of IRT, see de Boeck & Wilson, 2004; Embretson & Reise, 2000).

Based on the potential for PSS and associated cognitive
impairment to interfere with standardized tests, it was hypothesized that
standardized test performance would be negatively affected by the
acquisition or exacerbation of PSS. Specifically, we predicted that
after taking into account baseline standardized test scores, combat
experience, and baseline levels of PSS, postdeployment PSS
would be negatively associated with vocabulary and reasoning
test-taking ability. Findings provide potentially valuable
information regarding the nature of the relationship between PSS and
verbal and logical reasoning test performance.

Method 

Study  Design  and  Sampling 

Participants were drawn from the larger NDHS study sample.
The current study included only those from the larger cohort (N =
654) who (a) were active-duty Army soldiers, (b) deployed to the
Iraq war zone during the first wave of NDHS data collection, and
(c) completed predeployment assessments (Time 1, between April
and December 2003) and postdeployment assessments (Time 2,
between January and May 2005). In the larger study, sampling was
conducted at the battalion level, with battalions chosen to reflect
heterogeneous deployment experiences (Vasterling et al., 2006).
Based on power calculations and anticipated participation and
attrition rates, a target sample of 850 deploying soldiers was
selected for the larger study. Participants, referred at random to the
study by battalion commanders, consented individually and were
offered a way to exit the study area unobserved if they declined to
participate. At the individual level, exclusion criteria included
pending separation from military service or reassignment or
physical limitations.

Sample  Characteristics 

Sample demographic characteristics can be found in Table 1. By
occupational specialty our sample was as follows: infantry (n =
234), maintenance (electronics and mechanical; n = 152),
communications and intelligence (n = 101), health care (n = 43),
support and administration (n = 43), supply (n = 54), other (n =
27). In assessing the occupational distribution of the sample, it is
important to consider that OIF has been characterized by high
levels of combat exposure, even in traditionally noncombat
occupational specializations. The proportions of participants
experiencing several types of combat experiences, as measured by a
modified version of the Combat Experiences module of the Deployment
Risk and Resilience Inventory (King, King, & Vogt, 2003), are
included in Table 2.


Table 1

Descriptive Statistics for Sample Used in Current Study

Descriptive                       N        %  Minimum  Maximum       M      SD
Age                             654             17.68    46.48   25.03    5.25
Gender                          653
  Female                         56     8.56
  Male                          597    91.28
Highest grade level (school)    653              8.00    18.00   12.46    1.25
Years in the Army               653       [illegible]    24.00    3.91    4.26
Marital status                  654
  Single                        305    46.64
  Married                       297    45.41
  Divorced/separated             47     7.19
  Live-in partner                 5     0.76
Gender (% male)                 600    91.70
Race/ethnicity                  654
  African American              106    16.21
  Asian American                 17     2.60
  Caucasian                     369    56.42
  Hispanic American              96    14.68
  Other                          66    10.09
Assessment scores
  Time 1 logical reasoning      654                 1       24   20.83    3.61
  Time 2 logical reasoning      654                 3       24   21.30    3.33
  Time 1 vocabulary             654                 3       25   16.10    5.09
  Time 2 vocabulary             654                 3       25   16.85    5.04
PCL-C score
  Time 1                        654                17       78   29.16   12.51
  Time 2                        654                17       80   32.33   13.21

Note. PCL-C = Posttraumatic Stress Disorder Checklist, Civilian version.
Table 2

Percentage of Study Participants With Combat Experience During Deployment

                                                                                    At least a few
Combat experience                                                   N   Ever    SE  times per week    SE
Went on combat patrols or missions                                651     91   1.1              61   1.9
Encountered land or water mines and/or booby traps                647     60   1.9              19   1.5
Received hostile incoming fire from small arms, artillery,        649     98   0.6              67   1.8
  rockets, mortars, or bombs
Received friendly incoming fire from small arms, artillery,       649     22   1.6               4   0.8
  rockets, mortars, or bombs
In a vehicle that was under fire                                  652     74   1.7              23   1.7
Attacked by terrorists or civilians                               646     69   1.8              26   1.7
Part of a land or naval artillery unit that fired on the enemy    648     23   1.6               8   1.1
Part of an assault on entrenched or fortified positions           648     32   1.8               6   0.9
Took part in an invasion that involved naval and/or land forces   645     29   1.8               6   0.9
In a unit that engaged in battle in which it suffered casualties  648     64   1.9               8   1.1
Witnessed someone from own unit or an ally unit being             649     55   2.0               4   0.7
  seriously wounded or killed
Witnessed soldiers from enemy troops being seriously              650     61   1.9               9   1.1
  wounded or killed
Was wounded or injured in combat                                  650     14   1.4               0   0.3
Fired weapon at the enemy                                         651     60   1.9              15   1.4
Killed or thought killed someone in combat                        649     44   2.0               5   0.9
Participated in a support convoy                                  650     95   0.9              37   1.9
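
The standard errors in Table 2 are consistent with the usual binomial
formula for a sample proportion, SE = sqrt(p(1 - p)/n). The sketch below
is an assumption about how such values can be reproduced, not a procedure
reported by the authors; the function name percent_se is hypothetical.

```python
import math

def percent_se(percent: float, n: int) -> float:
    """Binomial standard error of a percentage, in percentage points."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)

# First row of Table 2: 91% of 651 respondents ever went on combat
# patrols or missions, and 61% did so at least a few times per week.
print(round(percent_se(91, 651), 1))  # 1.1
print(round(percent_se(61, 651), 1))  # 1.9
```

Both values match the tabled standard errors for that row to one decimal
place.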

It is notable that the majority (61%) of participants were
involved in combat patrols or missions at least a few times per week.
Further, of those who were involved in combat patrols or missions
at least a few times per week, 64%, χ²(1) = 6.64, p = .01,
indicated that they also received hostile incoming fire from small
arms, artillery, rockets, mortars, or bombs at least a few times per
week. Regarding past deployment history, of those sampled for the
current article, 14 had deployed at least once to a hazardous area¹
excluding the current deployment since 2001. Only two
participants had deployed twice to a hazardous area since 2001.

Measures 

Posttraumatic Stress Disorder Checklist, Civilian version
(PCL-C). The PCL-C (Weathers, Huska, & Keane, 1991) is a
widely used, 17-item self-report scale that measures the severity of
PTSD symptoms. Respondents are asked to indicate how much
PTSD symptoms have bothered them on a 5-point Likert
scale (from not at all to extremely), without reference to a specific
traumatic experience. Possible scores on the PCL-C range from 17
(all responses are not at all) to 85 (all responses are extremely).
Items on the PCL-C are congruent with the Diagnostic and
Statistical Manual of Mental Disorders (4th ed., text rev.; APA, 2000)
and address each of the three symptom clusters. For example,
respondents are asked how often "they feel distant or cut off from
people" or how often they "have repeated, disturbing dreams of a
stressful military experience." Time 1 and Time 2 Cronbach's
alphas for the PCL-C are .93 and .94, respectively. Other studies
have found the PCL to be characterized by high test-retest
reliability (r = .92 and .88, immediate and 1-week retest,
respectively), internal consistency (α = .94), and convergent validity
(rs > .75) with other PTSD measures (Ruggiero, Del Ben, Scotti,
& Rabalais, 2003). Further, the PCL was found to correlate well
with the Clinician-Administered PTSD Scale (r = .93), and it is
recommended as a good screening and self-report measure of
PTSD (Blanchard, Jones-Alexander, Buckley, & Forneris, 1996).
In the current sample, women scored about four points higher on
Time 1 PCL than men, t(649) = 2.55, p < .01. No gender
differences existed at Time 2. Older participants had, on average,
lower PCL scores at Time 1. For each 5-year increase in age,
participants scored about one and a half points lower on the PCL,
t(651) = 3.21, p < .01. No age differences existed at Time 2.
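
As a minimal sketch of the scoring described above, the total PCL-C
score is the sum of 17 item ratings, each from 1 (not at all) to 5
(extremely). The cluster item ranges used here (items 1-5 reexperiencing,
6-12 avoidance/numbing, 13-17 hyperarousal) follow the DSM-IV symptom
grouping and are an assumption about how subscale scores were formed;
the function name score_pcl is hypothetical.

```python
def score_pcl(items):
    """Return total and cluster scores for 17 PCL-C item ratings (each 1-5)."""
    if len(items) != 17 or any(r not in (1, 2, 3, 4, 5) for r in items):
        raise ValueError("expected 17 ratings, each an integer from 1 to 5")
    return {
        "total": sum(items),
        "reexperiencing": sum(items[0:5]),      # items 1-5 (assumed grouping)
        "avoidance_numbing": sum(items[5:12]),  # items 6-12 (assumed grouping)
        "hyperarousal": sum(items[12:17]),      # items 13-17 (assumed grouping)
    }

print(score_pcl([1] * 17)["total"])  # 17, the minimum possible score
print(score_pcl([5] * 17)["total"])  # 85, the maximum possible score
```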

Automated Neuropsychological Assessment Metrics
(ANAM) logical reasoning assessment. The ANAM logical
reasoning task (Reeves, Kane, Elsmore, Winter, & Bleiberg, 2002)
measures grammatical and logical reasoning. The logical
reasoning task is taken from the larger ANAM battery, a clinical battery
originally designed by the Office of Military Performance
Assessment Technology to measure cognitive functioning across
administrations (Kabat, Kane, Jefferson, & DiPino, 2001). The larger
assessment has proven useful in a number of clinical applications
and as a cost-effective measure of cognitive function (Jones, Loe,
Krach, Rager, & Jones, 2008). The accuracy measure of the logical
reasoning task has been found to correlate exceptionally well with
the Cognitive Efficiency cluster of the Woodcock-Johnson Tests
of Cognitive Ability (Jones et al., 2008). All 24 logical reasoning
items present both a logical rule (such as "& comes before #") and a
logical relation (such as "& #") in which the examinee chooses
whether the relation is the same as or different from the rule. (In
the previous example the correct answer is same; that is, & does
come before # as stated in the rule.) In this sample, the reliabilities
were α = .84 for Time 1 and α = .82 for Time 2. No age or gender
differences existed on this measure at either time point.
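
The reliabilities reported for each measure are Cronbach's alpha
coefficients. A minimal sketch of the computation follows; the toy
response matrix is invented for illustration and is not study data, and
the function name cronbach_alpha is hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Invented 0/1 (incorrect/correct) responses: 4 respondents, 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(cronbach_alpha(toy), 2))  # 0.75
```

Alpha rises when item scores covary strongly relative to their individual
variances, which is why it is read as an index of internal consistency.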

NES3 vocabulary assessment. The NES3 vocabulary task is
a computer-administered 25-item multiple-choice test designed to
estimate general verbal ability (Letz, 2000) and is derived in part
from the Armed Forces Qualification Test-Verbal subtest. The
larger NES3 assessment is designed to assess neurobehavioral
function in studies of environmental and occupational health. The
NES3 vocabulary task correlates well with the Wechsler Adult
Intelligence Scale-Revised vocabulary test (Krengel et al., 1996).
In this sample, the reliabilities were α = .87 for both Times 1 and
2. No gender differences existed on this measure at Time 1 or Time
2. Older participants, on average, scored higher on the vocabulary
task. A 5-year age increase equated to approximately a one-point
increase in a participant's vocabulary score at Time 1, t(648) =
5.91, p < .01, and approximately a one-point increase at Time 2,
t(651) = 5.22, p < .01.

¹ For the current article, hazardous area is defined as Afghanistan, Iraq,
Bosnia, Kosovo, or Kuwait.

Combat experiences. Combat exposure was measured with
the Combat Experiences scale from a modified version of the
Deployment Risk and Resilience Inventory (King et al., 2003). The
Combat Experiences scale is a 15-item, five-category Likert scale.
Response options range from the experience never happened to the
experience happened daily or almost daily. Higher sum scores on
this scale are indicative of greater combat exposure. A complete
list of these items can be found in Table 2. In this sample, internal
consistency was high (α = [value missing]).

Other covariates. To control for other person and contextual
factors that may also be responsible for changes in test-taking
ability, we added a number of covariates to the model. These
include age, gender, average number of hours of sleep for the week
prior to the Time 2 data collection, and average weekly
alcohol consumption (in number of drinks) for the month prior to
the Time 2 data collection. Finally, given the association between
PTSD and traumatic brain injury (OR = 2.98, 95% CI [1.70, 5.24];
Hoge et al., 2008), we included as a predictor whether or not the
respondent reported a head injury resulting in loss of
consciousness between the pre- and postdeployment data collections.

Analysis  Method 

In the current study, we used a latent regression Rasch model
(Adams et al., 1997; de Boeck & Wilson, 2004; Zwinderman,
1991, 1997) that included attributes of the person to explain
individual differences. This method permitted the addition of
covariates in IRT models. The latent regression Rasch model is a
type of multilevel IRT model that has been shown to have utility
in analyzing item response data when explaining individual
differences is of interest (Cheong & Raudenbush, 2000; Pastor,
2003). The power of the latent regression Rasch model is in the
addition of predictors that allow for a flexible exploration of
individual differences with respect to (latent) ability, which
standardized tests are presumed to measure. Specifically, adding
covariates for PSS and a number of control variables into the Rasch
model allowed an examination of possible associations between
PSS and an examinee's test-taking ability.

The latent regression Rasch model is an extension of a standard
Rasch model (Rasch, 1980) with the addition of linear predictors
for the person's value on the latent trait. The model for the latent
trait, $\theta_p$, is a linear regression equation; that is,

$$\theta_p = \sum_{j=1}^{J} \gamma_j Z_{pj} + \varepsilon_p \qquad (1)$$

where $Z_{pj}$ is the value of covariate $j$ ($j = 1, \ldots, J$) for person $p$ and
$\gamma_j$ is the regression coefficient of covariate $j$. This model includes
a random-person effect $\varepsilon_p$, which represents unexplained
variability between people in terms of their ability. In line with typical IRT
conventions, the latent trait is standardized to a mean of zero and
a standard deviation of one. The item difficulty parameters in the
latent regression Rasch model ($\beta_i$) are identical to the item difficulty
parameters in the classical Rasch model. Using this approach
yields the following model for responses to items:

$$P(Y_{pi} = 1 \mid \theta_p, \beta_i) = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)} \qquad (2)$$

where $P(Y_{pi} = 1 \mid \theta_p, \beta_i)$ is the probability that a person with ability
$\theta_p$ gives a correct response on item $i$ with difficulty $\beta_i$.
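
The latent regression and the Rasch response function described above
can be sketched in a few lines. This is an illustrative sketch only:
the covariate values, coefficients, and item difficulty are hypothetical,
and the function names are not taken from the study's software.

```python
import math

def latent_ability(covariates, gammas, residual=0.0):
    """Latent regression: theta_p = sum_j gamma_j * Z_pj + epsilon_p."""
    return sum(g * z for g, z in zip(gammas, covariates)) + residual

def rasch_probability(theta, beta):
    """Rasch model: probability of a correct response given ability theta
    and item difficulty beta."""
    return math.exp(theta - beta) / (1.0 + math.exp(theta - beta))

# Hypothetical person with two covariates and a hypothetical item difficulty.
theta = latent_ability(covariates=[1.0, 2.0], gammas=[0.5, -0.25])
print(rasch_probability(theta, beta=0.0))  # 0.5, because theta equals beta
```

When ability equals item difficulty the probability of a correct response
is exactly .5; ability above the difficulty pushes it toward 1.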

All IRT models were fit to data with PROC NLMIXED in SAS
9.1 (SAS Institute, 2003; for examples of input code, see de Boeck
& Wilson, 2004; Sheu, Chen, Su, & Wang, 2005). Likelihood ratio
tests were used to test the significance of the effect of PSS (for a
discussion of likelihood ratio tests, see Agresti, 2007).
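
A likelihood ratio test compares the log-likelihood of a model without
the effect of interest against the model that includes it; with one
added parameter, twice the difference is referred to a chi-square
distribution with 1 degree of freedom. The sketch below uses hypothetical
log-likelihoods and the identity that the chi-square(1) upper tail equals
erfc(sqrt(x/2)); it is not the SAS procedure used in the study, and the
function name lr_test_1df is hypothetical.

```python
import math

def lr_test_1df(loglik_reduced: float, loglik_full: float):
    """Likelihood ratio statistic and p-value for one added parameter."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods, not values from the study.
stat, p = lr_test_1df(loglik_reduced=-1005.0, loglik_full=-1000.0)
print(stat, p < 0.01)  # 10.0 True
```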

Models

To investigate whether changes in PSS during deployment
significantly predicted differences in examinees' ability to
correctly answer test items, we fit several IRT models to the data.
All models were fit twice: once each for the Time 2 logical
reasoning and the vocabulary item responses. To create a
measure of residualized change taking into account Time 1 values
of PSS and cognitive task scores, in every model, we included
Time 1 PSS and the Time 1 (predeployment) value of the
relevant cognitive task score as predictors. In addition, we
included a number of covariates to control for other factors that
may contribute to ability differences. The covariates, taken
from Time 2 measurements, included combat exposure, gender,
age, alcohol consumption, sleep, and head injury with loss of
consciousness. Time 2 PSS measures entered into the model
either as the total PCL score or by symptom cluster. Given the
high level of collinearity between symptom clusters, subscale
scores were not entered into a single model. Rather, for those
models that examined the effect of symptom cluster on
test-taking ability, each symptom cluster score was used separately
as a Time 2 predictor. This approach resulted in four models
each for the logical reasoning and the vocabulary items. The
effects estimated in the vocabulary and logical reasoning
models are detailed in Table 3.


Table 3

Summary of Predictors for Each Model

Effect                         Model 1   Model 2   Model 3   Model 4
PSS^a                             X         X         X         X
PSS^b                             X
Reexperiencing^b                            X
Avoidance/numbing^b                                   X
Hyperarousal^b                                                  X
Head injury^b                     X         X         X         X
Combat experience^b               X         X         X         X
Age^b                             X         X         X         X
Gender^b                          X         X         X         X
Sleep^b                           X         X         X         X
Alcohol^b                         X         X         X         X
Cognitive assessment score^a      X         X         X         X

Note. Effects are identical for the logical reasoning and vocabulary
models. PSS = posttraumatic stress symptoms.
^a Measured at Time 1. ^b Measured at Time 2.
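The model fit to the NDHS data was

$$P(Y_{pi} = 1 \mid \theta_p, \beta_i) = \frac{\exp\left(\gamma_1 \mathrm{PSS1}_p + \gamma_2 \mathrm{PSS2}_p + \sum_{j=3}^{J} \gamma_j Z_{pj} + \varepsilon_p - \beta_i\right)}{1 + \exp\left(\gamma_1 \mathrm{PSS1}_p + \gamma_2 \mathrm{PSS2}_p + \sum_{j=3}^{J} \gamma_j Z_{pj} + \varepsilon_p - \beta_i\right)}$$

where $\gamma_1$ was the coefficient for the effect of PSS before
deployment, $\gamma_2$ was the coefficient for the effect of PSS or symptom
cluster after deployment, $\gamma_j$ ($j = 3, \ldots, J$) were the coefficients
for the other covariates $Z_{pj}$ as listed in
the Measures section, and $\beta_i$ was the item difficulty.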

Significant parameter estimates for Time 2 PSS (i.e., $\gamma_2$) suggest
that as an individual's PCL score or symptom cluster score
changes, the probability of correctly answering an item changes
according to the level of symptom severity. In other words, a
significant negative effect for Time 2 PSS suggests that this
disorder reduces test-taking ability. Given the prospective design
of the study, we can reasonably attribute this reduction to changes
in PSS.
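
Because the coefficient for Time 2 PSS acts on the logit scale, its
effect on the probability of a correct response depends on the starting
probability. The sketch below assumes a coefficient of -.01 per PCL
point (the rounded magnitude reported in the Results), a hypothetical
baseline probability, and all other covariates held fixed; the function
name shifted_probability is hypothetical.

```python
import math

def shifted_probability(p_baseline, gamma, delta_pss):
    """Probability of a correct response after PSS changes by delta_pss points."""
    logit = math.log(p_baseline / (1.0 - p_baseline)) + gamma * delta_pss
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical: baseline probability .50, coefficient -.01, 20-point PCL increase.
print(round(shifted_probability(0.50, -0.01, 20), 2))  # 0.45
```

A small per-point coefficient can therefore add up to a noticeable drop
in the probability of a correct answer across the wide range of possible
PCL scores.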

Results 

The parameters from the models fit to the logical reasoning data
and the vocabulary data are presented in Table 4.


The logical reasoning models all yielded significant likelihood
ratio test statistics for PSS at Time 2: PSS, χ²(1) = 84,
p < .01; reexperiencing, χ²(1) = 19, p < .01; avoidance-numbing,
χ²(1) = 18, p < .01; hyperarousal, χ²(1) = 85, p < .01. That is,
for each model, the addition of PCL scores or subscale scores at
Time 2 significantly improved the fit of the relevant model when
compared with a model that did not contain the effect for Time 2
PCL scores. For the logical reasoning model in which Time 2 PCL
total score was entered into the model, the effect of PSS at Time
2 was significant when controlling for PSS at Time 1 and the other
covariates, $\gamma_2$ = -.01, t(636) = -5.15, p < .01. Further, with the
exception of Time 1 logical reasoning performance, Time 2 PSS
was the only significant effect in the model. In other
words, none of the other covariates that might be associated with
diminished ability were significantly associated with correctly
answering logical reasoning items.

On average, study participants reported a Time 2 PCL score of
32.33 (SD = 13.21). This suggests that when holding all other
covariates constant, the probability of correctly answering the
average logical reasoning item for someone with a PCL score of 32
is approximately .61. In comparison, the probability of a correct
answer for a person with a PCL score of 17 (the lowest possible
score) is .64. This suggests an average reduction of approximately
3 percentage points in the probability of correctly answering the
average logical reasoning item at Time 2 for a study participant with
an average PCL score. Although the average effect was small, at the extreme
end of the range, the effect was much larger. For example, a person


Table 4

Summary of Models

                      Model 1                     Model 2                     Model 3                     Model 4
Model parameter       Estimate   SE   df      t   Estimate   SE   df      t   Estimate   SE   df      t   Estimate   SE   df      t

Logical reasoning
  PSS^a                   .00   .00  636   0.23       .00   .00  635  -0.12       .00   .00  635  -0.18       .00   .00  634   0.01
  PSS^b                  -.01*  .01  636  -5.15
  Reexperiencing^b                                   -.01   .03  635  -1.63
  Avoidance/numbing^b                                                            -.01   .01  635  -1.41
  Hyperarousal^b                                                                                             -.01   .01  634  -1.72
  Head injury             .00   .01  636   0.17       .00   .01  635   0.09       .00   .01  635   0.05       .00   .01  634   0.13
  Combat experience       .02   .11  636  -0.22       .02   .11  635   0.16       .00   .11  635    [?]      -.03   .11  634  -0.26
  Age                     .03   .02  636   1.45       .03   .02  635   1.44       .03   .02  635   1.46       .03   .02  634   1.38
  Gender                  .00   .00  636   1.48       .00   .00  635   1.46       .00   .00  635   1.41       .00   .00  634   1.48
  Sleep                   .05   .11  636   0.49       .05   .11  635   0.46       .04   .11  635   0.40       .06   .11  634   0.49
  Alcohol                 .00   .00  636  -1.63       .00   .00  635  -1.62      -.01   .00  635  -1.80      -.01   .00  634  -1.84
  Logical reasoning^a     .05*  .00  636  27.96       .05*  .00  635  27.49       .05*  .00  635  27.38       .05*  .00  634  27.55

Vocabulary
  PSS^a                   .00   .00  637   0.30       .00   .00  636  -0.39       .00   .00  636  -0.02       .00   .00  635  -0.10
  PSS^b                  -.01*  .00  637  -5.50
  Reexperiencing^b                                   -.01   .00  636  -1.75
  Avoidance/numbing^b                                                            -.01   .00  636  -2.50
  Hyperarousal^b                                                                                             -.01   [?]  635  -1.97
  Head injury             .01   .00  637   1.75       .01   .00  636   1.68       .01   .00  636   1.73       .01   .00  635   1.67
  Combat experience       .03   .07  637   0.38       .01   .07  636   0.08       .00   .07  636  -0.01       .02   .07  635   0.26
  Age                     .00   .01  637   0.17       .01   .01  636   0.49       .01   .01  636   0.41       .00   .01  635   0.23
  Gender                  .00   .00  637  -0.25       .00   .00  636  -0.35       .00   .00  636  -0.33       .00   .00  635  -0.32
  Sleep                   .01   .07  637   0.10       .00   .07  636   0.07       .00   .07  636   0.00       .01   .07  635   0.08
  Alcohol                 .00   .00  637  -1.45       .00   .00  636  -1.76       .00   .00  636  -1.72       .00   .00  635  -1.69
  Vocabulary^a            .22*  .00  637  46.51       .22*  .00  636  46.29       .22*  .00  636  46.33       .22*  .00  635  46.44

Note. PSS = posttraumatic stress symptoms.
^a Measured at Time 1. ^b Measured at Time 2.
*p < .05 (Bonferroni adjusted < 0.013).
with a Time 2 PCL score of 71 (the maximum score among NDHS
participants) would have an 11% lower probability of correctly
answering the most difficult logical reasoning item than someone
with the lowest PCL score. Even more pronounced were the differences
in correct response probabilities on a logical reasoning item of
average difficulty. For an item of this type, the difference in the
probability of a correct answer was more than 13% between those
with the lowest Time 2 PSS levels (PCL = 17) and the highest
Time 2 PSS levels (PCL = 71).
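
The arithmetic behind these probabilities can be sketched with a logistic response function. The sketch below is illustrative rather than a reproduction of the fitted model: it assumes that the reported probability of .64 at PCL = 17 can serve as the baseline, that the Time 2 PSS effect is exactly the rounded Table 4 estimate of -.01 per PCL point on the logit scale, and that all other covariates are held fixed. The names `GAMMA_PCL`, `P_CORRECT_AT_MIN`, and `p_correct` are hypothetical. Because the slope is rounded to two decimals, the output matches the reported values only approximately.

```python
import math

GAMMA_PCL = -0.01        # assumed: rounded Time 2 PSS effect per PCL point (logit scale)
PCL_MIN = 17             # lowest possible PCL score
P_CORRECT_AT_MIN = 0.64  # reported probability for an average item at PCL = 17


def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def p_correct(pcl: float) -> float:
    """Probability of a correct answer on an average item, other covariates fixed."""
    return inv_logit(logit(P_CORRECT_AT_MIN) + GAMMA_PCL * (pcl - PCL_MIN))


for pcl in (17, 32, 71):
    drop = P_CORRECT_AT_MIN - p_correct(pcl)
    print(f"PCL = {pcl}: P(correct) = {p_correct(pcl):.3f}, drop = {drop:.3f}")
```

Under these assumptions the drop at PCL = 71 is slightly above .13, in line with the difference described for an item of average difficulty.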

Figure 1 displays item characteristic curves for a logical
reasoning item of average difficulty and participant groups with the
lowest versus the highest Time 2 PCL scores. The gray curve
represents correct response probabilities for participants with the
lowest observed Time 2 PCL scores, and the black curve denotes
correct response probabilities for participants with the highest
observed Time 2 PCL scores. Here we can see that regardless of
ability level, the probability of correctly answering a typical
logical reasoning item is lower for the group with the highest level of
PSS at Time 2. Only seven participants in the sample were at this
pathological extreme, whereas 28 participants reported a Time 2
PCL score of 60 points or more.

For the logical reasoning models in which PCL symptom cluster
scores were entered into the model, the effects of the symptom
cluster scores measured at Time 2 were not significant at the
Bonferroni-adjusted significance level: reexperiencing,
γ = -.01, t(635) = -1.63, p = .10; avoidance-numbing,
γ = -.01, t(635) = -1.41, p = .16; hyperarousal,
γ = -.01, t(634) = -1.72, p = .09. Indeed, besides the effect of
Time 1 cognitive assessment scores, there were no significant
effects in the subscale models for the logical reasoning items. This
suggests that no single Time 2 PSS cluster was responsible for
differences in logical reasoning test-taking ability. Rather, findings
suggest that the full spectrum of PTSD symptoms was responsible
for logical reasoning ability differences.

As with the logical reasoning models, the vocabulary models
also yielded significant likelihood ratio test statistics for PSS
at Time 2: PSS, χ²(1) = 93, p < .01; reexperiencing,
χ²(1) = 52, p < .01; avoidance-numbing, χ²(1) = [illegible], p < .01;
hyperarousal, χ²(1) = 130, p < .01. The results suggest that in
each vocabulary model, the fit was significantly improved by
adding an effect for Time 2 PSS. In terms of significant effects in
the models fit to the data, the findings were similar to the logical
reasoning models. That is, the vocabulary model that included
Time 2 PCL scores exhibited a significant Time 2 PSS effect when
controlling for the other effects in the model, γ = -.01, t(637) =
-5.00, p < .01. Besides the Time 1 vocabulary assessment score,
none of the other predictors in the model were significant.

The Time 2 PSS effect can be interpreted as follows. For an
average respondent with a Time 2 PCL score of 32 (the average
PCL score in the sample), we expect that the probability of a
correct answer on an average vocabulary item is .55. A respondent
whose Time 2 PCL score is just 17 has an associated probability of
a correct answer of about .58, so the respondent with the higher
Time 2 PCL score has a 3% higher probability of an incorrect
answer, all else equal.

Vocabulary test-taking ability differences were more pronounced
when comparing study participants at the highest and
lowest end of the Time 2 PCL spectrum on the hardest items. For
example, a respondent with the highest Time 2 PCL score would
correctly answer the most difficult vocabulary item about 6% of
the time. In comparison, a respondent with the lowest Time 2 PCL
score would answer the same item correctly about 11% of the time,
for a difference of 5%. However, the largest disparity between study
participants at the high and low end of the Time 2 PSS spectrum was
on items of average difficulty, where differences in the probability of
correct answers emerged on the order of more than .13: the
probability of a correct response was .45 for the highest Time 2
PCL score versus .58 for the lowest. The
differences in probabilities of a correct response to an average
vocabulary item are presented in Figure 2. Although on average the
difficulty for vocabulary items was higher than for logical reasoning
items, we found similar results on the vocabulary assessment between
those with the highest PCL scores and those with the lowest PCL
scores. That is, across the ability continuum, those with the highest
levels of PSS had lower probabilities of a correct response.

Similar to the logical reasoning models, the vocabulary models
in which Time 2 PCL symptom cluster scores were entered into the
model did not show significant Time 2 PSS effects with an
adjusted significance level: reexperiencing, γ = -.01, t(636) =
-1.75, p = .08; avoidance-numbing, γ = -.01, t(636) = -2.50,
p = .013; hyperarousal, γ = -.01, t(635) = -1.97, p = .05.
Again, there were no significant effects in the subscale models for
the vocabulary items other than the Time 1 cognitive assessment
scale. These findings substantiate the results from the logical
reasoning models. That is, individual symptom clusters were not
sufficient to diminish test-taking ability. Instead, Time 2 PSS, as
measured by all three symptom clusters, seems to be an important
determinant of vocabulary test-taking ability following exposure to
an extreme traumatic stressor.
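
The adjusted significance level used for the subscale effects can be illustrated with a short sketch. It assumes a plain Bonferroni correction (.05 divided by the four models fit to each outcome) and compares the reported, rounded p values with that threshold; the names `adjusted_alpha`, `reported_p`, and `is_significant` are hypothetical. Because the reported p values are rounded, a borderline case such as p = .013 cannot be resolved more finely than shown here.

```python
ALPHA = 0.05
N_MODELS = 4  # four models were fit to each outcome

# Bonferroni-adjusted level: 0.0125, reported as .013 after rounding.
adjusted_alpha = ALPHA / N_MODELS

# Rounded p values as reported for the Time 2 symptom cluster effects.
reported_p = {
    ("logical reasoning", "reexperiencing"): 0.10,
    ("logical reasoning", "avoidance-numbing"): 0.16,
    ("logical reasoning", "hyperarousal"): 0.09,
    ("vocabulary", "reexperiencing"): 0.08,
    ("vocabulary", "avoidance-numbing"): 0.013,
    ("vocabulary", "hyperarousal"): 0.05,
}


def is_significant(p: float, alpha: float = adjusted_alpha) -> bool:
    """True when a p value falls below the Bonferroni-adjusted level."""
    return p < alpha


for (task, cluster), p in reported_p.items():
    verdict = "significant" if is_significant(p) else "not significant"
    print(f"{task}, {cluster}: p = {p} -> {verdict} at alpha = {adjusted_alpha}")
```

Without the adjustment, the vocabulary avoidance-numbing effect (p = .013) would fall below the conventional .05 level, which shows how much the conclusion about individual clusters depends on the correction.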

Discussion 

In this article, we used IRT with covariates to assess whether
changes in PTSD symptomatology had a significant effect on
test-taking ability on two cognitive tasks, administered after
exposure to wartime stressors, that measure constructs similar to those
assessed on standardized tests. Findings indicated that for both the
logical reasoning task and the vocabulary task, a residualized
measure of Time 2 PSS adjusted for Time 1 PSS values was
significantly associated with diminished ability to answer items
correctly, especially for participants who showed the largest
increase in PSS at Time 2. At the extreme, people with the highest
levels of Time 2 PSS would face a 13% reduction in the
probability of correctly answering a typical logical reasoning item or
vocabulary item when compared with those with the lowest Time
2 PSS levels.

Previous research on college-age groups suggests that educational
attainment is negatively impacted by anxiety disorders
(Kessler, Foster, Saunders, & Stang, 1995); however, less is
known about the specific effects of anxiety disorders on test-taking
ability, particularly from a prospective approach. The current study
sheds light on this issue and suggests that after controlling for
predeployment PSS and a number of possibly confounding factors,
PTSD symptoms adversely affect test-taking ability in study
participants, and that there is a dosing effect in which more severe
symptoms are associated with poorer test taking.


Given that four models were fit to the data, we used a Bonferroni
adjusted significance level, α/C = .05/4 = .013, where C is equal to the
number of hypotheses tested.
Logical reasoning item of average difficulty (beta = -0.75)


Figure 1. Item characteristic curves for a logical reasoning item of average difficulty and two groups with the
highest and lowest levels of posttraumatic stress symptoms. PCL = Posttraumatic Stress Disorder Checklist.


To interpret the possible effect of the highest level of PSS on
total test scores, we simulated item response data for 2,000
examinees of average ability, 1,000 each in the low- and high-PSS
groups, corresponding to PCL scores of 17 and 71, respectively.
Using item difficulties calculated in earlier analyses, latent trait
values of zero for those with the lowest levels of PSS, and adjusted
latent trait values of -.71 for those with the highest levels of PSS,
we generated item responses using our IRT model for all 2,000
examinees on both cognitive tasks. This resulted in item responses
for 2,000 examinees on 24 items for the logical reasoning
assessment and 25 items for the vocabulary assessment. On the basis of
this method, we found that those at the lowest end of the PSS
spectrum received an average score of 16.07 on the logical
reasoning assessment, whereas those in the high-PSS group received
a significantly lower average score of 11.80, mean difference = 4.27,
t(998) = -4.20, p < .01. Findings were similar for the vocabulary
assessment, for which simulated data resulted in a low-PSS group
mean of 17.74 and, again, a significantly lower mean for the
high-PSS group of 12.69, mean difference = 5.05, t(998) = -5.06,
p < .01. Score differences on both cognitive assessments between
those in the low- and high-PSS groups also suggest meaningful practical
differences as indicated by large Cohen's d effect sizes (logical
reasoning, d = 1.66; vocabulary, d = 1.90). These findings
indicate that widespread test-taking ability differences stemming from
PSS can have important consequences on cognitive assessment
scores.
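
A simulation of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure or data used in the study: it assumes a one-parameter logistic (Rasch-type) response function, uses 24 hypothetical, evenly spaced item difficulties because the estimated difficulties are not listed in this section, and takes the latent trait values of 0 and -.71 from the text. The function names are hypothetical, and the resulting means and effect size will therefore differ from the reported 16.07, 11.80, and d = 1.66.

```python
import math
import random
import statistics


def p_correct(theta: float, beta: float) -> float:
    """Rasch-type probability of a correct response (assumed model form)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def simulate_total_scores(theta, difficulties, n_examinees, rng):
    """Total number-correct scores for examinees who share one latent trait value."""
    return [
        sum(rng.random() < p_correct(theta, beta) for beta in difficulties)
        for _ in range(n_examinees)
    ]


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


rng = random.Random(2024)  # fixed seed so the sketch is reproducible

# HYPOTHETICAL difficulties: 24 evenly spaced values from -3.0 to 1.5.
difficulties = [-3.0 + 4.5 * i / 23 for i in range(24)]

low = simulate_total_scores(0.0, difficulties, 1000, rng)     # lowest PSS group
high = simulate_total_scores(-0.71, difficulties, 1000, rng)  # highest PSS group

print(f"low-PSS mean  = {statistics.mean(low):.2f}")
print(f"high-PSS mean = {statistics.mean(high):.2f}")
print(f"Cohen's d     = {cohens_d(low, high):.2f}")
```

Even with made-up difficulties, a shift of -.71 on the latent trait scale produces a gap of several items on a 24-item test, which is the qualitative pattern the simulation in the text is meant to convey.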


Vocabulary item of average difficulty (beta = -0.51)


Figure 2. Item characteristic curves for a vocabulary item of average difficulty and two groups with the highest
and lowest levels of posttraumatic stress symptoms. PCL = Posttraumatic Stress Disorder Checklist.
Given the significant effect that Time 2 PSS has on an
examinee's ability to correctly answer the two cognitive tasks used in
this study, it is reasonable to expect that these findings may be
relevant in other contexts. As of the end of 2007, more than 1.64
million service members had deployed in support of the wars in
Iraq and Afghanistan, with some units serving multiple rotations of
12 to 15 months (Tanielian & Jaycox, 2008). It is important to
consider that many of these military servicewomen and -men will
pursue higher education or otherwise face testing situations for
promotion or job placement. Estimates suggest that Montgomery
GI Bill usage rates exceed 65% (Winter, 2005). Understanding
how veterans' experiences affect their ability to pursue higher
education or career advancement is important for both the mental
health and the education communities.

If indeed this article's findings do generalize to a civilian
population, the implications for this research may be far-reaching.
Estimates suggest that over 125,000 children in New Orleans were
displaced as a result of Hurricane Katrina (Redlener, 2006), and
nearly one half of children in shelters exhibit some type of
emotional or behavioral disorder such as PTSD (Abramson & Garfield,
2006). Internationally, PTSD rates among children are estimated at
10% in Baghdad (Eccleston, 2007) and nearly 33% in Mosul, Iraq
(Eccleston, 2007), and 13% in posttsunami southern Thailand
(Thienkrua et al., 2006). Internationally, areas such as these are
recipients of recovery aid from international organizations that
commonly mandate adherence to structural adjustment programs,
a component of which may include standardized tests of
achievement as markers of sufficient progress. Our findings suggest that in
this context, achievement results from standardized cognitive
assessments should be used with caution, if at all. Alternatively, test
administrations in known conflict or disaster areas should include
a PTSD scale so that proficiency score estimates can be adjusted
accordingly. Either empirical estimates of the PTSD effect can be
used or additional studies regarding the magnitude of the PTSD
effect on ability could be undertaken.

There are several limitations associated with the study. First,
although the median age (23.5 years) of the cohort is fairly
representative of the median age (20.5 years) of U.S. college
students (National Center for Education Statistics, 2005), the
proportion of women (8%) is not representative. Furthermore,
systematic rather than population-based sampling was used to derive
the study sample and included only one service branch, which
limits the generalizability of the findings to a broader population.
Although 27% of the study participants reported education levels
beyond high school, participants may differ systematically from
young adults who choose college over military service.
Furthermore, cognitive assessments used for the current analysis were
drawn from those collected during the NDHS, and they do not
represent the exact types of items found on standardized college
entrance tests such as the SAT and the GRE. However, the
cognitive processes examined closely match many of those measured
on standardized assessments. Also, we did not assess clinical
PTSD diagnoses.

Regarding the tasks used in this analysis, the items may have
been too easy to fully detect differences associated with PSS. Item
difficulties were in general quite low and ranged from
approximately -3.91 to 1.97 at Time 2. Similarly, the probability that an
average examinee would correctly answer an average item ranged
from a low of approximately .71 to nearly .96, depending on the
cognitive assessment. As a result, findings may underestimate the
impact of PSS on test-taking ability; however, the characteristics
of the logical reasoning and vocabulary tasks do allow some
additional insights into the effect of PSS on test-taking ability.
That is, the relative ease of the task and low-stakes nature of the
testing context suggest that processes other than simple test
anxiety explain associations between test performance and PSS
(Hembree, 1988).

The findings from this study nevertheless provide evidence of
the potential detrimental effect of PSS on standardized test
performance. Given the unique longitudinal design of the NDHS, we
had the opportunity to consider the baseline status of individuals
who were eventually exposed to traumatic stressors, allowing for
stronger causal inferences than those typically permitted within
cross-sectional designs. Additional replication studies that include
representative samples that are administered standardized college
entrance tests as well as a clinical assessment of PTSD will allow
findings to be applied to a broader population of college
applicants. Future research will also benefit from consideration of the
predictive validity of standardized academic assessments for those
with PSS, including whether lower standardized test scores as a
result of PTSD or PSS can accurately predict future academic
performance.

Our findings have implications for the interpretation of
standardized achievement assessment differences, particularly among
students at high risk for PTSD and other psychiatric disorders that
might affect test-taking ability. Differences in ability at the levels
observed in this study do not inevitably imply biases sufficient to
necessitate corrective action. However, given the prevalence of
trauma exposure in the general population and the ubiquity of
standardized assessments among college applicants, this study
suggests that recognizing and understanding the potential
additional disadvantages to which examinees with PSS are subject will
be important to both examinees and educational counselors. In
particular, prospective college students with PSS may benefit from
counseling targeting coping strategies to help manage the negative
emotional consequences of psychological trauma exposure and
compensatory strategies to assist in test taking.